Dataset

  • We have a train set of 25,000 pictures of dogs and cats
  • And a test set of 12,500 pictures to identify

Let's see what the train set contains:

In [172]:
show_cats_and_dogs(0, 6)

Approach

  • We will use a convolutional neural network (CNN) with Keras on a TensorFlow backend
  • With optimizer = RMSprop
  • The output layer is a sigmoid, since this is binary classification trained with binary cross-entropy
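
As a small illustration of why a sigmoid output pairs with binary cross-entropy, here is a minimal sketch in plain Python. The helper names are hypothetical and this is not how Keras implements the loss internally; the label convention (1 = dog, 0 = cat) follows the prediction code in this notebook, where outputs of 0.5 or more are printed as "Dog".

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, p, eps=1e-7):
    """Loss for one sample: y_true is 0 (cat) or 1 (dog), p is the predicted P(dog)."""
    p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


undecided = sigmoid(0.0)                    # a logit of 0 gives exactly 0.5
print(undecided)                            # 0.5
print(binary_crossentropy(1, undecided))    # ln(2), about 0.693
print(binary_crossentropy(1, sigmoid(3)))   # confident and correct: much smaller loss
```

A useful reference point: a model that always outputs 0.5 has a loss of ln 2 ≈ 0.693, so a validation loss near that value means the model is essentially guessing.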

Let's look at the base model as well:

In [262]:
optimizer = RMSprop(lr=0.0001)
objective = 'binary_crossentropy'
metrics = ['accuracy']


def catdog():
    model = Sequential()

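    # First block: two 3x3 convolutions with 32 filters, then 2x2 max pooling.
    # NOTE: Conv2D uses the default data_format here, while the pooling layers are
    # given "channels_first"; the shapes in model.summary() suggest the two
    # disagree about which axis holds the channels.
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=X_train.shape[1:], activation='relu'))
    model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(data_format="channels_first", pool_size=(2, 2)))
    #model.add(Dropout(0.25))

    # Three more blocks with 64, 128 and 256 filters (2**6, 2**7, 2**8).
    for i in range(6, 9):
        model.add(Conv2D(2**i, (3, 3), padding='same', activation='relu'))
        model.add(Conv2D(2**i, (3, 3), padding='same', activation='relu'))
        model.add(MaxPooling2D(data_format="channels_first", pool_size=(2, 2)))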

    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))

    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    model.compile(loss=objective, optimizer=optimizer, metrics=metrics)
    return model

In [250]:
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_344 (Conv2D)          (None, 3, 64, 32)         18464     
_________________________________________________________________
conv2d_345 (Conv2D)          (None, 3, 64, 32)         9248      
_________________________________________________________________
max_pooling2d_167 (MaxPoolin (None, 3, 32, 16)         0         
_________________________________________________________________
dropout_118 (Dropout)        (None, 3, 32, 16)         0         
_________________________________________________________________
conv2d_346 (Conv2D)          (None, 3, 32, 64)         9280      
_________________________________________________________________
conv2d_347 (Conv2D)          (None, 3, 32, 64)         36928     
_________________________________________________________________
max_pooling2d_168 (MaxPoolin (None, 3, 16, 32)         0         
_________________________________________________________________
conv2d_348 (Conv2D)          (None, 3, 16, 128)        36992     
_________________________________________________________________
conv2d_349 (Conv2D)          (None, 3, 16, 128)        147584    
_________________________________________________________________
max_pooling2d_169 (MaxPoolin (None, 3, 8, 64)          0         
_________________________________________________________________
conv2d_350 (Conv2D)          (None, 3, 8, 256)         147712    
_________________________________________________________________
conv2d_351 (Conv2D)          (None, 3, 8, 256)         590080    
_________________________________________________________________
max_pooling2d_170 (MaxPoolin (None, 3, 4, 128)         0         
_________________________________________________________________
flatten_39 (Flatten)         (None, 1536)              0         
_________________________________________________________________
dense_115 (Dense)            (None, 256)               393472    
_________________________________________________________________
dropout_119 (Dropout)        (None, 256)               0         
_________________________________________________________________
dense_116 (Dense)            (None, 256)               65792     
_________________________________________________________________
dropout_120 (Dropout)        (None, 256)               0         
_________________________________________________________________
dense_117 (Dense)            (None, 1)                 257       
_________________________________________________________________
activation_39 (Activation)   (None, 1)                 0         
=================================================================
Total params: 1,455,809
Trainable params: 1,455,809
Non-trainable params: 0
_________________________________________________________________
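
The parameter counts in the summary can be checked by hand. The sketch below is plain Python with hypothetical helper names; the input-channel values are not stated anywhere in the notebook but are inferred from the summary's "Param #" column, so treat them as an assumption. A 3x3 convolution has (3 · 3 · in_channels + 1) · filters parameters, and a dense layer has (in_units + 1) · out_units.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


# (in_channels, filters) per Conv2D layer, inferred from the summary above.
conv_layers = [(64, 32), (32, 32), (16, 64), (64, 64),
               (32, 128), (128, 128), (64, 256), (256, 256)]
# (in_units, out_units) per Dense layer; Flatten yields 3 * 4 * 128 = 1536 units.
dense_layers = [(3 * 4 * 128, 256), (256, 256), (256, 1)]

total = (sum(conv2d_params(3, 3, c, f) for c, f in conv_layers)
         + sum(dense_params(i, o) for i, o in dense_layers))

print(conv2d_params(3, 3, 64, 32))  # 18464, matches conv2d_344
print(total)                        # 1455809, matches "Total params"
```

Note that the first layer only reaches 18,464 parameters with 64 input channels, not 3. That is consistent with Conv2D treating the last axis of a (3, 64, 64) input as channels, while the pooling layers, set to channels_first, halve the last two axes instead.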

Train

  • 750 train samples & 250 validation samples (validation_split=0.25)
  • Up to 10 epochs on CPU
  • EarlyStopping to avoid overfitting

~ 50-60% accuracy:

In [263]:
model = catdog()
early_stop = EarlyStopping(monitor='val_loss', patience=3, verbose=1, mode='auto')
model.fit(X_train, y_train, epochs=10, validation_split=0.25, shuffle=True, callbacks=[early_stop])
Train on 750 samples, validate on 250 samples
Epoch 1/10
750/750 [==============================] - 6s 7ms/step - loss: 1.2257 - acc: 0.5053 - val_loss: 0.7341 - val_acc: 0.4520
Epoch 2/10
750/750 [==============================] - 4s 5ms/step - loss: 0.8017 - acc: 0.4973 - val_loss: 0.6913 - val_acc: 0.5400
Epoch 3/10
750/750 [==============================] - 4s 6ms/step - loss: 0.7453 - acc: 0.5280 - val_loss: 0.8077 - val_acc: 0.4560
Epoch 4/10
750/750 [==============================] - 4s 6ms/step - loss: 0.7517 - acc: 0.5280 - val_loss: 0.6986 - val_acc: 0.5680
Epoch 5/10
750/750 [==============================] - 4s 5ms/step - loss: 0.7465 - acc: 0.5293 - val_loss: 0.6810 - val_acc: 0.5680
Epoch 6/10
750/750 [==============================] - 4s 6ms/step - loss: 0.7388 - acc: 0.5187 - val_loss: 0.7170 - val_acc: 0.4600
Epoch 7/10
750/750 [==============================] - 5s 7ms/step - loss: 0.7398 - acc: 0.5147 - val_loss: 0.6967 - val_acc: 0.4560
Epoch 8/10
750/750 [==============================] - 4s 6ms/step - loss: 0.7331 - acc: 0.5347 - val_loss: 0.6930 - val_acc: 0.5240
Epoch 00008: early stopping
Out[263]:
<keras.callbacks.History at 0x7f62932cb748>
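
To see why training stopped at epoch 8, here is a simplified re-implementation of the patience rule in plain Python. It is only a sketch with a hypothetical function name, not the Keras callback itself, and it ignores options such as min_delta. The val_loss values are copied from the log above.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


val_losses = [0.7341, 0.6913, 0.8077, 0.6986, 0.6810, 0.7170, 0.6967, 0.6930]
print(early_stopping_epoch(val_losses))  # 8
```

The best val_loss (0.6810) is reached in epoch 5; epochs 6, 7 and 8 do not improve on it, so with patience=3 the run ends after epoch 8.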

Let's see what the model says about the test data as well:

In [265]:
predictions = model.predict(test, verbose=0)
for i in range(0, 10):
    if predictions[i, 0] >= 0.5: 
        print('I am {:.2%} sure this is a Dog'.format(predictions[i][0]))
    else:
        print('I am {:.2%} sure this is a Cat'.format(1-predictions[i][0]))
        
    plt.imshow(test[i].T)
    plt.show()
I am 54.37% sure this is a Dog
I am 50.39% sure this is a Dog
I am 53.18% sure this is a Cat
I am 50.02% sure this is a Dog
I am 51.67% sure this is a Dog
I am 51.41% sure this is a Dog
I am 54.33% sure this is a Cat
I am 51.86% sure this is a Cat
I am 51.12% sure this is a Dog
I am 51.49% sure this is a Cat